COVID-19 hospitalization and mortality and hospitalization-related utilization and expenditure: Analysis of a South African private health insured population

Background
Evidence on the risk factors for COVID-19 hospitalization, mortality, hospital stay and cost of treatment in the African context is limited. This study aims to quantify the impact of known risk factors on these outcomes in a large South African private health insured population.
Methods and findings
This is a cross-sectional analytic study based on the analysis of the records of members belonging to health insurances administered by Discovery Health (PTY) Ltd. Demographic data for 188,292 members who tested COVID-19 positive over the period 1 March 2020–28 February 2021 and the hospitalization data for these members up until 30 June 2021 were extracted. Logistic regression models were used for hospitalization and death outcomes, while length of hospital stay and (log) cost per patient were modelled by negative binomial and linear regression models, respectively. We accounted for potential differences in the population served and the quality of care within different geographic health regions by including the health district as a random effect. Overall hospitalization and mortality risks were 18.8% and 3.3%, respectively. Those aged 65+ years, those with 3 or more comorbidities and males had the highest hospitalization and mortality risks and the longest and costliest hospital stays. Hospitalization and mortality risks were higher in wave 2 than in wave 1. Hospitalization and mortality risks varied across provinces, even after controlling for important predictors. Hospitalization and mortality risks were the highest for diabetes alone or in combination with hypertension, hypercholesterolemia and ischemic heart disease.
Conclusions
These findings can assist in developing better risk mitigation and management strategies. They can also allow for better resource allocation and prioritization planning as health systems struggle to meet the increased care demands resulting from the pandemic while having to deal with these in an ever-more resource constrained environment.


Introduction
COVID-19 was declared a Public Health Emergency of International Concern on 30 January 2020, and a pandemic on 11 March 2020 by the World Health Organization (WHO) [1,2].
There is now a body of evidence on the risk factors for COVID-19 related hospitalization and mortality [3-11] from studies in North America, Europe and China. A systematic review and meta-analysis of 102 papers covering 121,437 infected patients found that comorbidities such as hypertension, diabetes, cardiovascular diseases and chronic kidney disease were associated with the severity of COVID-19 infection [8]. In particular, the elderly and males with underlying diseases were more likely to have severe COVID-19. Of the 102 papers included in the review, 80 were from Asia, 15 from Europe, 11 from North America and 1 from South America but none from Africa. A second systematic review and data synthesis of COVID-19 length of hospital stay across 52 studies found that patients with COVID-19 in China remained in hospital longer than elsewhere [9]. None of the systematic reviews published to date have included any studies from Africa. The relevance of these studies to the broader African and sub-Saharan context is uncertain given the underlying demographic and disease profile differences between the regions [12]. Evidence on the risk factors in the African or South African context is limited, with a review of the literature revealing only three published studies [13-15] that reported on the risk factors for COVID-19 related mortality. Two of these studies involved only public sector patients, among whom HIV and tuberculosis are important risk factors for death. South African studies that have included data from private hospitals have included public vs private sector as a predictor variable rather than exploring outcomes separately in the two sectors (for example by stratifying on health care sector) [16].
Little is known about COVID-19 risk factors in the private sector population or how they may impact on hospital related utilization and expenditure patterns. The for-profit private sector is an important provider of health services in the sub-Saharan African region. A recent WHO report estimates the percentage of health services sourced from private providers in the AFRO region at 40% [17]. The South African health system is highly fragmented with substantial disparities in access, facilities and spending between the government-funded public health system and the private health system. Around 18% of the total South African population is covered by private voluntary health insurance [18] which is the predominant funding mechanism for the private health system. Access to private health services depends mainly on the ability to pay and there are stark racial and socio-economic differences in utilization of public compared to private health services. As of 2018, only 10% of black Africans were members of private medical schemes compared to 73% of white South Africans [19].
Differences in severe disease risk and hospitalization between the public and private sector populations are expected. To date, only two studies in South Africa have included data from the private sector [14,20]; they found a lower overall case-fatality risk compared to the public sector but did not explore underlying risk factors in the two populations. Evidence on the risk factors for COVID-19 related hospitalization, health care utilization and expenditure patterns is also limited, and further evidence is required to confirm and better understand these patterns.
Because it relies largely on a fee-for-service model for provider payment and an established clinical coding system, the private voluntary health insurance model generates substantial data. These data enable analysis of the utilization, risk and expenditure of beneficiaries, which can provide valuable insights. In South Africa, this approach has been taken to investigate, for example, the use of antibiotics [21], the take-up of influenza vaccines [22] and caesarean section rates [23]. Elsewhere, researchers in the Republic of Korea have used insurance administrative datasets to investigate comorbidities and factors determining medical expenses and length of stay for COVID-19 patients admitted to hospital [24].
The aim of this study was to assess and quantify the impact of known risk factors on COVID-19 hospitalization, hospital related utilization and expenditure, and mortality in a large South African private health insured population over a 12 month period. The results from this study will contribute to addressing the gap in the knowledge base on the actual observed risks and subsequent hospitalization using real world data (RWD). It will enable targeted patient management strategies and risk stratification, identification of opportunities for provider quality and efficiency improvements and will generate information to assess the cost and cost effectiveness of preventative and treatment interventions for patients with COVID-19.

Study design
This is a cross sectional analytic study based on the analysis of the demographic and claims records of members belonging to medical schemes administered by Discovery Health (PTY) Ltd (DH), one of the largest health insurance administrators in South Africa.

Study population
The study population consisted of 3.5 million individuals from 1.7 million households sharing the same health insurance policy. These policy holders belonged to 19 different health insurance schemes administered by DH, representing around a third of South Africa's privately insured population. The average family size, average contributions and health care expenditure of the study population were compared to that of the broader health insured population and found to be comparable ensuring that the findings of this study are generalisable to the broader South African population with private health insurance (S1 Table) [25].

Data sources
Secondary de-identified demographic and claims data of members belonging to medical schemes administered by DH for the period from 1 March 2020 to 30 June 2021 was extracted from the data warehouse of Quantium Health, an independent company that provides data analytics and strategic consulting services to DH.
For each insured individual, the data contains the following variables: unique study individual identifier, date of birth, sex and province. For each claim submitted to the administrator for reimbursement of services rendered or items dispensed to an insured individual the data contains the following variables: a unique study individual identifier; dates for the commencement and completion of the service; a code and description for each service rendered/item dispensed, an ICD-10 (10th revision of the International Statistical Classification of Diseases and Related Health Problems) code for the diagnosis of the condition being treated; a Current Procedural Terminology (CPT) code for the procedure carried out; a National Pharmaceutical Product Index (NAPPI) code for any surgical, medical or consumable item dispensed; and the amount being claimed.

Data extraction and classification
The approach used to extract and classify the data is schematically summarized in Fig 1. From all the data for the period, a 3-step approach was used to extract the data. For the first step, individuals who had tested positive for COVID-19 through either the PCR, PKR or real-time RT-PCR tests in the period from 1 March 2020 to 28 February 2021 (study period) were identified and a "demographic extract" consisting of demographic, comorbidity and status elements was extracted for these individuals. For the comorbidity variable, the following conditions were considered as comorbidity risk factors for COVID-19 based on a review of published literature, consultation with the South African-based medical experts overseeing utilisation management at the health insurance, as well as consideration of the health profile of private sector patients in South Africa: Cancer, Chronic Renal Disease, Congestive Cardiac Failure, Chronic Obstructive Pulmonary Disease, Diabetes Mellitus, HIV, Hypercholesterolaemia, Hypertension, Hypothyroidism, Ischaemic Heart Disease, Pregnancy, Tuberculosis. The individuals with these comorbidities were identified using the South African Council for Medical Scheme Guideline algorithms for identifying members with medical conditions using claims records [26].
For the second step, the claims of the COVID-19 positive individuals over the period from 1 February 2020 to 30 June 2021 were assessed to determine whether they had been hospitalized for the treatment of COVID-19 and a hospital admission indicator was created and added to the demographic extract. Although only individuals who tested positive over the period from 1 March 2020 to 28 February 2021 were included in the study sample, the claims up to 30 June 2021 were included in hospitalization analysis to ensure that the data is not "right censored" as there is lag between testing and hospitalization and between hospitalization and the claims being received by the administrator.
For the third step, for all those COVID-19 positive individuals who had been hospitalized, a "hospital admissions" extract was created consisting of claims, length of stay and treatment marker elements.
To calculate the total cost per hospitalized COVID-19 patient we considered all claims for the hospitalization event including costs for pharmaceuticals, hospital bed charges, consumables, radiology services, general medical practitioner and specialist medical practitioner claims.

Statistical analysis
Four COVID-19 outcomes were analyzed. Two of the outcomes were binary, namely (a) whether the patient was hospitalized and (b) whether the patient died (here, we assumed all deaths amongst these patients were due to or exacerbated by COVID-19). For the COVID-19 patients who were admitted to a hospital, two further outcomes were analyzed, namely (c) length of stay (in days) in the hospital and (d) total cost of claims per patient.
The predictor variables considered in the analysis included age (at time of diagnosis); sex; number of comorbidities; pandemic wave (pre-wave 1 (1 March 2020-6 June 2020), wave 1 (7 June 2020-22 August 2020), post-wave 1 (23 August 2020-14 November 2020), wave 2 (15 November 2020-28 February 2021)); province (Eastern Cape, Free State, Gauteng, KwaZulu-Natal, Limpopo, Mpumalanga, North West, Northern Cape, Western Cape); health insurance cover level (1, 2, 3, 4, where cover level 1 plans offered the lowest level of benefits and cover level 4 plans offered the highest level of benefits); and hospital network (the six main private hospital networks: A-F). A classification system was used for plans and provider networks to blind the actual names of plan types and specific hospital providers, which are proprietary information. The data are grouped into 19 health regions for the insurance company's administrative purposes. In our analyses, the health region was chosen for the random effects to account for potential differences in the population served and the quality of care within different geographic health regions.
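To make the wave variable concrete, the following is a minimal sketch of how a diagnosis date could be mapped to the four pandemic wave periods defined above. The boundary dates are taken from the text; the names `WAVE_PERIODS` and `classify_wave` are illustrative assumptions and do not describe the authors' actual data pipeline.

```python
from datetime import date

# Wave boundaries as stated in the text (start and end dates are inclusive).
# The names used here are hypothetical, chosen for illustration only.
WAVE_PERIODS = [
    ("pre-wave 1", date(2020, 3, 1), date(2020, 6, 6)),
    ("wave 1", date(2020, 6, 7), date(2020, 8, 22)),
    ("post-wave 1", date(2020, 8, 23), date(2020, 11, 14)),
    ("wave 2", date(2020, 11, 15), date(2021, 2, 28)),
]


def classify_wave(diagnosis_date):
    """Return the wave label for a diagnosis date, or None if the date
    falls outside the study period (1 March 2020 to 28 February 2021)."""
    for label, start, end in WAVE_PERIODS:
        if start <= diagnosis_date <= end:
            return label
    return None
```

Returning `None` for dates outside the study period, rather than raising an error, is a design choice of this sketch; it keeps out-of-range records easy to filter.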
Summary statistics included frequencies and percentages for categorical data, and for continuous data median and interquartile range were used. For modelling purposes, two-level random-effects logistic regression models were used for the two binary outcomes, where the level-2 unit was the health region.
Exploratory analysis showed that a Poisson model was insufficient to model the length of hospital stay as the data exhibited overdispersion in the sense that its variance exceeded its mean. Thus, a random effects negative binomial regression model was used for the number of days spent in a hospital and we accounted for health region variation as well as overdispersion. Unadjusted Incidence Rate Ratios (IRR) and adjusted Incidence Rate Ratios (aIRR) are presented.
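The overdispersion check described above can be illustrated with a simple variance-to-mean ratio. This is a minimal sketch, not the exploratory analysis the authors ran in Stata: the function name `dispersion_ratio` is an assumption, and the length-of-stay values are hypothetical and are not study data. Under a Poisson model the ratio is expected to be close to 1, so a ratio well above 1 suggests that a negative binomial model may be more appropriate.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean for a list of counts.

    A value close to 1 is consistent with a Poisson model; a value well
    above 1 suggests overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m


# Hypothetical lengths of stay in days (illustrative only, not study data).
example_stays = [1, 2, 2, 3, 4, 5, 6, 6, 8, 10, 14, 21, 35]
ratio = dispersion_ratio(example_stays)
is_overdispersed = ratio > 1
```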
The total cost data was heavily skewed to the right, and upon taking the logarithm of it, the transformed total cost had a "normal" shape. Thus, a linear mixed regression model on the log of total cost, again using the health region as a clustering level, was used.
Rather than presenting the estimated coefficients (which are increases in log costs per unit change in the respective predictor variable or category), the estimated coefficients (e.g. beta1) were expressed as a percentage increase or decrease, depending on whether the coefficient is positive or negative, using the formula (exp(beta1) - 1) × 100. The coefficients from the linear regression model are shown in S2 Table. We performed both univariate and multivariable analyses for all four outcomes (the latter including all the predictors and random effects) to identify independent predictors of the modelled outcomes. The multivariable analyses produced adjusted effects, as opposed to the unadjusted effects from the univariate analyses.
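The conversion from a log-cost coefficient to a percentage change can be sketched as follows. The formula is the one given above; the function names are illustrative assumptions, and the inverse function is included only to show how a reported percentage maps back to a coefficient.

```python
import math


def percent_change(beta):
    """Percentage change in cost implied by a coefficient on the log scale:
    (exp(beta) - 1) * 100."""
    return (math.exp(beta) - 1.0) * 100.0


def beta_from_percent(pct):
    """Inverse transformation: the log-scale coefficient implied by a
    percentage change (only defined for pct > -100)."""
    return math.log(1.0 + pct / 100.0)
```

For example, a reported 172% increase in cost would correspond to a coefficient of roughly 1.0 on the log scale, while a negative coefficient always maps to a percentage decrease.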
A further separate analysis was carried out to assess the association between the most common comorbidities and combinations of comorbidities and the risk of hospitalization and mortality. For this analysis, frequencies, percentages and unadjusted odds ratios (estimated using standard regression models) are reported for two outcome variables: hospitalization and mortality. Stata/SE 16.1 was used for all the analyses.
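For a single binary exposure, the unadjusted odds ratio from a logistic regression equals the cross-product ratio of the corresponding 2×2 table, and its Wald confidence interval can be computed on the log scale. The sketch below illustrates that calculation under these assumptions; the function name `odds_ratio_ci` and its argument layout are hypothetical, and this is not the authors' Stata code.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with the outcome      b: exposed without the outcome
    c: unexposed with the outcome    d: unexposed without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Here `a` and `b` would be, for instance, the numbers of individuals with a given comorbidity who were and were not hospitalized, and `c` and `d` the corresponding numbers in the reference group.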

Ethical considerations
Data for the study was made available as part of Quantium Health's commitment to support research initiatives with broader public health significance. The company does not advise its clients on the clinical treatment of its members. The data was accessed in terms of and under the conditions set out in the agreement between Quantium Health and Discovery Health and a memorandum of understanding between Quantium Health and the study investigators.
All the data was provided in a de-identified format and aggregated at an individual level and the research team had no access to information that would enable the identification of any individual. All findings are presented at an aggregate level and no confidential member, health care provider or scheme information is disclosed. Ethics approval for the use of the database to carry out this study was granted by the Ethics Committee of the SAMRC (project registration number EC018-4/2021).

Sample description
The total dataset comprised the claims of an average of 3.48 million individuals over the period 1 March 2020 to 28 February 2021. From this dataset, the claims data related to a total of 188,292 individuals who tested positive for COVID-19 over the period were extracted and analyzed. Of the total cases, 41.4% were aged between 40 and 65 years, 54.3% were female, 37.6% were diagnosed in wave 1 and 51.1% were diagnosed in wave 2, 40.9% were from Gauteng province, 65.6% had no comorbidities and 61.3% were on the cover level 3 plans (Table 1).

Hospitalization risk
The overall hospitalization risk for COVID-19 positive individuals was 18.8% (Table 1). Age, sex and number of comorbidities were found to be independent predictors of hospital admission. Patients aged above 65 years (aOR 4.31; 95% CI 4.02-4.62), males (aOR 1.19; 95% CI 1.17-1.23) and those with more than three comorbidities (aOR 3.97; 95% CI 3.76-4.21) were more likely to be admitted to hospital. The pre-wave 1 period (aOR 1.49; 95% CI 1.38-1.61), post-wave 1 (aOR 1.47; 95% CI 1.41-1.54) and wave 2 (aOR 1.18; 95% CI 1.15-1.21) all had a higher hospitalization risk compared to wave 1. Provincial differences in hospitalization risk were also observed, with admissions more likely in Limpopo and the Northern Cape and less likely in Free State and Gauteng, compared with the Western Cape province. Health insurance cover level 4 (the most expensive plan with the highest level of insurance cover) was associated with a higher risk of hospitalization compared to cover level 1 (OR 1.22; 95% CI 1.16-1.27) in univariate analyses (Table 1). However, we did not include health insurance cover level in the multivariable analyses because it was highly correlated with both age and number of comorbidities, which could have resulted in multicollinearity problems. Sixty-seven percent of individuals on cover level 4 were over the age of 40, and of those with more than 3 comorbidities almost a quarter (23%) were on cover level 4, compared with only 14% of those with 1 comorbidity.

Hospitalization utilization
The overall median length of hospital stay for COVID-19 positive individuals was 6 days (IQR 3-10) (Table 2). In multivariable analysis, there was an increasing trend in length of hospital stay with age. Those aged over 65 years had a two-fold increased length of hospital stay compared with those less than 18 years (aIRR 2.00; 95% CI 1.89-2.12). Males had longer hospital stays than females (aIRR 1.08; 95% CI 1.06-1.09), and an increase in length of stay was observed for each additional comorbidity, with individuals with more than three comorbidities having the longest period of hospitalization (aIRR 1.14; 95% CI 1.11-1.18). Length of hospital stay was longer in wave 2 compared to wave 1 (aIRR 1.03; 95% CI 1.01-1.05). Provincial differences in length of hospitalization were observed, with Eastern Cape, Free State, Gauteng, KwaZulu-Natal and North West all having significantly longer hospital stays for COVID-19 patients compared to the Western Cape (Table 2). In the unadjusted model assessing the effect of insurance cover, only cover level 4 (the most expensive plan with the highest level of cover) was significantly associated with length of hospital stay, but this effect was not significant in the adjusted model because the variable was highly correlated with age and number of comorbidities. In univariate analysis, hospital network B had significantly longer hospital stays compared with network A hospitals, but the effect was not significant in the multivariable model.

Hospitalization expenditure
The overall median hospitalization cost per COVID-19 positive case was R49,836 (IQR R28,464-R107,020) (Table 3). After adjustment for all other factors, hospitalization cost increased with each age category, and those over age 65 years incurred a 172% higher cost of hospitalization compared to individuals under age 18 years (95% CI 153.45%-191.54%). The cost of hospitalization for males was 18% higher than that for females (95% CI 16.18%-20.92%). Cost of hospitalization increased for each additional comorbidity. Those with more than three comorbidities had 28% higher hospitalization costs compared with individuals with no comorbidities (95% CI 23.37%-33.64%). With regard to pandemic wave period, hospitalization during wave 2 was 7% more costly compared to the wave 1 period (95% CI 4.08%-9.42%). With regard to provincial differences, Gauteng and KwaZulu-Natal were both significantly more costly than the Western Cape (11% and 4% more costly respectively), whilst Limpopo and the Northern Cape were less costly compared to the Western Cape (14% and 10% less costly respectively). Hospitalization cost differences were noted between insurance cover levels, with levels 2, 3 and 4 being 8% more costly compared with level 1 plans. We also observed differences in cost across hospital networks after adjusting for all other covariates. Hospital networks C, D, E and F were all significantly less costly compared to network A (27%, 12%, 17% and 12% less costly respectively) (Table 3).

Risk by comorbidity condition type
Of the conditions considered as comorbidity factors, diabetes mellitus (on its own or in combination with other comorbidities) carried the highest hospitalization risk (OR 3.6; 95% CI 3.27-3.94 for diabetes mellitus only; OR 6.6; 95% CI 5.88-7.43 for diabetes with hypertension, hypercholesterolemia and ischemic heart disease) (Table 4). In terms of mortality risk, the combination of diabetes with hypertension, hypercholesterolemia and ischemic heart disease carried the highest mortality risk (OR 10.25; 95% CI 8.57-12.27). Hypertension in combination with heart disease (OR 6.94; 95% CI 5.66-8.51) or cancer (OR 6.10; 95% CI 4.47-8.33) also carried an increased risk for mortality (Table 4).

Discussion
This is the first study describing risk factors for COVID-19 hospitalization and mortality and hospitalization related utilization and expenditure amongst a private health insured population in Africa. From a study population of 188,292 COVID-19 cases, we found overall hospitalization and mortality risks of 18.8% and 3.3% respectively. COVID-19 positive individuals above the age of 65 years, those with 3 or more comorbidities and males had the highest risk across all 4 outcome measures. Overall, in line with studies carried out elsewhere [24], the findings suggest that the strongest predictors for COVID-19 related hospitalization, mortality [10,11], hospital related utilization [27] and expenditure [28] were age, followed by the number of comorbidities and then sex.
Regarding specific comorbidities, diabetes alone or in combination with hypertension, hypercholesterolemia and ischemic heart disease carried the greatest risk for hospitalization and death. These comorbidity risk factors for severe disease and death are similar to other settings. In contrast to research amongst public sector COVID-19 patients in South Africa [13], we did not find an association between HIV and mortality reflecting the different underlying disease profile of the private sector population. Around 4.7% of the private insured population in South Africa are registered on an HIV management program [25] whilst the HIV prevalence rate in the general population is 14% [29]. Another recent study in South Africa exploring risk factors for COVID-19 related in-hospital mortality found an HIV prevalence amongst hospitalized Covid-19 patients of 20.4% in the public sector and 2.2% in the private sector [16].
Research based on data from the national surveillance system, including both public and private sector patients, has reported a case fatality risk of 18.7% amongst hospitalized COVID-19 patients in the private sector and 27.5% amongst public sector patients [14]. Our mortality risk does include some deaths (509) amongst individuals who were never hospitalized, although almost all (92%) of the deaths in our sample occurred in hospital. Amongst hospitalized cases in our study the mortality risk is 16% (5742/35467), which compares well to the rate reported from national surveillance amongst private sector COVID-19 admissions. Differences in mortality risk between the public and private sector are expected due to differences in underlying disease profiles of patients, resourcing and case load. Provincial variation in all four outcome measures was found, even after adjustment for all other factors. This may reflect differences in clinical practice between private hospitals, which may not necessarily adhere to national Department of Health COVID-19 protocols. There may also be underlying differences in health seeking behaviour across provinces and different thresholds applied by general practitioners regarding when to admit patients to hospital. With regard to differences in hospitalization cost, Gauteng was almost 12% more expensive compared to the Western Cape. This is likely due to the reduced plan costs for coastal versus inland hospitals under the DH plan options [30].
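The mortality figures quoted in the preceding paragraph can be cross-checked with simple arithmetic. The sketch below uses only counts reported in the text; it assumes that the total number of deaths is the sum of in-hospital deaths (5742) and deaths amongst individuals who were never hospitalized (509), which the text implies but does not state explicitly. The variable names are illustrative.

```python
# Counts reported in the text.
total_cases = 188_292
hospitalized = 35_467
deaths_in_hospital = 5_742
deaths_never_hospitalized = 509

# Assumption: total deaths is the sum of the two reported death counts.
total_deaths = deaths_in_hospital + deaths_never_hospitalized


def pct(numerator, denominator):
    """Express a proportion as a percentage."""
    return 100.0 * numerator / denominator


hospitalization_risk = pct(hospitalized, total_cases)        # reported as 18.8%
overall_mortality_risk = pct(total_deaths, total_cases)      # reported as 3.3%
in_hospital_mortality = pct(deaths_in_hospital, hospitalized)  # reported as 16%
share_deaths_in_hospital = pct(deaths_in_hospital, total_deaths)  # reported as 92%
```

Under this assumption, the four derived percentages round to the values reported in the text.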
We found higher rates of hospitalization and mortality and longer duration of hospital stay during wave 2 compared with wave 1. This is similar to findings from the national surveillance system and is likely due to the higher incidence of COVID-19, greater pressure on hospitals and the emergence of the Beta variant [14].
In unadjusted analysis, individuals on level 4 insurance plans were found to have an increased risk of both hospitalization and mortality. This was related to individuals on these plans being older and with more comorbidities reflecting the trend for individuals to buy more expensive and comprehensive medical insurance as they get older and sicker [31].
Hospital networks also differed in the cost of COVID-19 care after adjustment for risk factors. While this could be due to differences in the reimbursement rates across the various hospital network groups, variation in the underlying clinical management of patients across hospital groups is likely to have played a role.
This analysis has a number of limitations. Only services for which claims were submitted were analysed. This could result in under-recording of the COVID-19 cases, particularly of "milder" cases and could result in an overstatement of the reported outcome measures. As this data set is from a period prior to the mass distribution of home-based COVID-19 self-test kits, COVID testing was by doctor referral available under private insurance at no charge where the patient tested positive. For those paying out of pocket the test was relatively expensive (US$55 per test). It is possible that some patients with private voluntary insurance elected to seek testing in the public system or pay out of pocket for the test in which case they would be missed from the data set. However, we consider that this number would be very small as there was strong incentive for those insured to utilize their insurance benefits, and the requirement for a doctor referral also strengthened capture of data. We further note that (1) our study population was limited to those who tested COVID-19 positive and (2) all the statistics that we present are based on those who tested positive (not the entire insured population). In addition, the costs do not reflect additional out of pocket expenditure for claims not covered under the benefits of certain health plans, for example the use of the pharmaceutical Remdesivir as an adjunct to existing treatments covered by the insurance plan.
Obesity and smoking, identified as risk factors in other studies, could not be included in our analysis due to limited or incomplete coding of these risks in our dataset. We have, however, included chronic obstructive pulmonary disease as a comorbidity, and other large analyses that included both smoking status and chronic pulmonary disease in adjusted models have noted a mediating effect, which limits the ability to explore the independent association of smoking status [32]. Vaccination status could also not be included due to the timeframe for the analysis in relation to the vaccine roll-out. Vaccination roll-out in South Africa began in May 2021 with individuals over the age of 60 years, and our analysis included hospitalization data up until 30 June 2021.
The potential for overcrowding in health facilities leading to increased death due to COVID-19 represented a significant concern for South Africa's COVID-19 response. Differentiating risk by pandemic wave period allowed this analysis to partially account for impact due to overcrowding, however it is unknown whether overcrowding had a differentiated impact on COVID-19 outcomes contingent upon particular risk factors.
This analysis has significant policy implications for both private and public sectors in South Africa for targeted, risk-based interventions and for reducing unwarranted variation. The outputs from the study can provide a basis for developing "risk calculators" to enable providers and funders to develop risk-based management strategies. Risk information can also inform broader policy, including risk stratification for employer "work from home" policies and other initiatives to reduce risk of infection.
The finding of provincial and hospital group variation on outcomes after adjusting for other risk factors is in line with the findings of the Competition Commission's Health Market Inquiry which identified this variation as a major source of private sector inefficiency in South Africa [33]. The results highlight the difficulties related to efforts to contain health system costs, with complex dynamics between independent clinician judgment, hospital groups, and insurance plan types in a system with few mechanisms for standardization, and even built-in efficiency impediments-for example health insurance providers are by law required to negotiate separately with individual hospital groups. They also highlight the need to identify and minimize unwarranted variation through the implementation of protocols which are evidence based, effective and cost-effective across the private sector. In addition, the analysis provides a basis for determining the cost effectiveness of different treatment and preventative interventions-an understanding of the expected cost and mortality risk in the South African context for different patient populations following a COVID-19 diagnosis enables realistic estimations about how best limited resources can be used in developing clinical guidelines and protocols.
Finally, this analysis demonstrates the untapped potential of RWD in health policy decision making and planning. Through the use of relatively simple multiple regression analysis of a substantive data set, insights were achievable in terms of variation on a range of outcomes, much of which would be unachievable in even a large-scale clinical trial. Using more extensive data analysis and time series, this dataset may also enable testing of various interventions and disease management dynamics, most notably the impact of vaccination and other treatment strategies. This analysis was conducted with ethics approval and maintained confidentiality of individual patient data which demonstrates a workable model of analysis. The larger lesson for health systems is that data systems across public and private sectors must be improved to enable use of data and analytics in decision making. Traditional analytical approaches for informing health investment and planning decisions rely on modeled prevalence and epidemiological information with intervention effects from published literature. While this approach enables evidence-informed decision making, it will always be limited to the extent that it reflects the actual context of the health system, inefficiencies and variations. The advancing digitization of health systems means that the routine use of health system information for investment, planning, efficiency and quality improvement is possible if the appropriate structures are present.

Conclusion
The information from this study, with one of the largest private sector patient datasets, can assist in developing better risk mitigation and management strategies. It can also allow for better resource allocation planning and prioritization strategies as health care systems struggle to meet the immediate and longer term increased health care demands resulting from the COVID-19 pandemic while having to deal with these in an ever-more resource constrained environment [34].
Supporting information S1